Frontiers in Plant Science



ORIGINAL RESEARCH ARTICLE 

published: 21 May 2014 
doi: 10.3389/fpls.2014.00214 




The splicing fate of plant SPO11 genes

Thorben Sprink* and Frank Hartung

Biosafety in Plant Biotechnology, Julius Kühn Institute, Quedlinburg, Germany



Edited by: 

Changbin Chen, University of 
Minnesota, USA 

Reviewed by: 

Paul Fransz, University of 
Amsterdam, Netherlands 
Joann Mudge, National Center for 
Genome Resources, USA 

*Correspondence:

Thorben Sprink, Biosafety in Plant
Biotechnology, Julius Kühn
Institute, Erwin-Baur Str. 27,
Quedlinburg 06484, Germany
e-mail: thorben.sprink@jki.bund.de



Toward a global understanding of plant meiosis, it seems essential to decipher
why all plants sequenced so far need, or at least encode, two different meiotic SPO11
genes. This is in contrast to mammals and fungi, where only one SPO11 is present. Both
SPO11 genes in Arabidopsis thaliana are essential for the initiation of double strand
breaks (DSBs) during meiotic prophase. In nearly all eukaryotic organisms, DSB induction
during prophase I by SPO11 leads to meiotic DSB repair, thereby ensuring the formation of
a necessary number of crossovers (CO) as physical connections between the homologous
chromosomes. We aim to investigate the specific functions and evolution of both SPO11
genes in land plants. Therefore, we identified and cloned the respective orthologous
genes from Brassica rapa, Carica papaya, Oryza sativa, and Physcomitrella patens. In
parallel, we determined the full length cDNA sequences of SPO11-1 and -2 from all
of these plants by RT-PCR. During these experiments we observed that the analyzed
plants exhibit a pattern of alternative splicing products of both SPO11 mRNAs. Such
aberrant splicing has previously been described for Arabidopsis and therefore seems to
be conserved throughout evolution. Most of the splicing forms of SPO11-1 and -2 seem to
be non-functional, as they showed either intron retention (IR) or shortened exons. However,
the positional distribution and number of alternative splicing events vary strongly between
the different plants. In most cases the cDNAs showed premature termination codons
(PTCs) due to frameshifts. Nevertheless, in some cases we found alternatively spliced
but functional cDNAs. These findings suggest that alternative splicing of SPO11
depends on the respective gene sequence and on the plant species. Therefore, this
conserved mechanism might play a role in the regulation of SPO11.

INTRODUCTION

In most eukaryotic organisms the rearrangement of the parental alleles by homologous
recombination during meiosis is one essential step leading to genetic diversity. Correct
pairing and subsequent homologous recombination in prophase I ensure stability of the
chromosome number on the one hand and, on the other hand, variability in the developing
cells, because crossover resolution results in an exchange of genetic material between
the homologous chromosomes. One crucial aspect in the progress of recombination is the
initial formation of double strand breaks (DSBs) by SPO11. The eukaryotic SPO11, which
shows homology to the archaeal topoisomerase VIA subunit (TOPVIA), is one of the key
factors mediating the formation of DSBs in a wide range of organisms (Bergerat et al.,
1997; Keeney et al., 1997; Grelon et al., 2001). Without DSBs and their subsequent repair
as crossovers there is no physical linkage between the homologous chromosomes, and random
chromosome distribution would occur (Cole et al., 2010). Like TOPVIA, SPO11 is able to
cleave DNA via a 5' phosphotyrosyl linkage, thereby defining the acceptor sites of
exchange between the parental genomes (Cole et al., 2010). In contrast to animals and
fungi, where a single SPO11 is sufficient for meiotic DSB formation, plants encode at
least two SPO11 proteins, referred to as SPO11-1 and -2, both of which are essential in a
functional protein form for DSB formation during meiosis (Keeney et al., 1997; Grelon
et al., 2001; Hartung et al., 2007; Shingu et al., 2012). However, the mechanism by which
two very different SPO11 proteins in plants induce DSBs specifically during meiosis
is still unclear. Our long term aim is to investigate the specific functions, origin, and
evolution of each SPO11 in the plant kingdom. By analyzing complete genomic sequences of
more than 40 plants, we were able to show that all land plants sequenced so far encode at
least three SPO11 genes. Two of them, AthSPO11-1 and -2, play a meiotic role. The third
one, AthSPO11-3, together with TOPVIB, the second subunit of the topoisomerase, possesses
essential functions during somatic development of plant cells but plays no role in
meiosis (Hartung et al., 2002a, 2007; Stacey et al., 2006; Simkova et al., 2012).

The phylogenetic analyses of SPO11-1 and -2 in land plants and algae show very clearly
that both genes are highly conserved and ancient in the plant lineage but cannot be found
in the same form in algae or protists. An analysis of a large number of available genomic
and protein sequences of SPO11 from virtually all kingdoms of life shows that at least one
duplication of the original SPO11 from archaea must have occurred very early, preceding
the split of animals and plants (Malik et al., 2007; this work). In addition, the intron
content and localization in the SPO11 genes from different organisms show ancestral
conservation between animals, fungi, and plants but also dramatic variations in protists
and green algae (Hartung et al., 2002b; this work).

Early investigations of SPO11-1 expression in Arabidopsis thaliana revealed an extensive
pattern of alternative splicing (Hartung and Puchta, 2000), which we are now able to show
for SPO11-2 as well. Analyzing the expression in other plants, we could identify various
non-functional alternative splicing events for SPO11-1 and -2 in Oryza sativa, Brassica
rapa, Carica papaya, and Physcomitrella patens. Additionally, we found putatively
functional forms of alternatively spliced SPO11-1 or -2 for the first time in plants,
namely in B. rapa, C. papaya, O. sativa, and P. patens. The fact that both SPO11 genes
show such a diversified splicing pattern and that alternative splicing of both is
conserved between the different species indicates that SPO11 has an ancient, complex
transcriptional regulation mechanism, most probably involving the nonsense-mediated decay
pathway as described for other meiotic genes (Chiba and Green, 2009).

MATERIALS AND METHODS 
ACCESSION NUMBERS 

We sequenced the cDNAs of SPO11-1 and SPO11-2 from B. rapa, C. papaya, and P. patens.
The resulting sequences have been deposited in this order in the NCBI database under
accession numbers KF841348, KF841349, KF841350, KF926859, KF926860, and KF926861.

PLANT MATERIAL AND GROWTH CONDITIONS 

Arabidopsis (Arabidopsis thaliana L.) wild type plants (Col-0) were seeded on a 3:1
mixture of soil and vermiculite supplemented with 4 g/l Plantacote (Wilhelm Haug GmbH und
Co. KG, Ammerbuch, Germany) as fertilizer and 0.4 g/l Exemptor (Bayer CropScience,
Langenfeld, Germany) as a preventive insecticide. Plants were kept under short day
conditions (8-h light/16-h dark cycle at 18°C) for 3 weeks and then transferred to a
greenhouse under a long day regime (16-h light/8-h dark at 20°C). Rice (O. sativa subsp.
japonica) plants were grown in the greenhouse under a long day regime, as was B. rapa
var. fastplant. Papaya (C. papaya L.) trees were grown in a public tropical greenhouse on
loamy soil. P. patens gametophores on solid media were kindly provided by Gertrud
Wiedemann from the group of Ralf Reski (Freiburg, Germany).

GENE COMPILATION AND SOURCE OF SEQUENCE DATA 

A total of 42 SPO11-1 and 39 SPO11-2 sequences from land plants were extracted from
different databases using the Arabidopsis and O. sativa orthologs as starting points. The
databases used were: Phytozome (http://www.phytozome.net), JGI (http://www.jgi.doe.gov),
Ensembl Plants (http://plants.ensembl.org/index.html), Gramene (http://www.gramene.org/),
CoGeBlast (http://genomevolution.org/r/5kv5), and NCBI
(http://www.ncbi.nlm.nih.gov/genomes/PLANTS/PlantList.html). Gene models that did not
predict the full length cDNA but only a few assembled ESTs were manually curated by
aligning these sequences to the annotated SPO11-1 and -2 of A. thaliana and O. sativa
using MegAlign (DNASTAR Inc., Madison, WI, USA). For some species the ESTs and the cDNA
prediction did not cover the whole sequence. In these cases, the corresponding genomic
DNA region was screened for possible matches, which were manually added to the model if
possible. To check the accuracy of our predictions, selected coding sequences (CDS) were
amplified using primers covering the whole predicted CDS (Supplemental Table 1). The
sequence of each gene was verified by Sanger sequencing (GATC Biotech AG, Konstanz,
Germany). All sequences used for phylogenetic comparisons and their accession codes are
listed in Supplemental Tables 2, 3.

RNA ISOLATION AND USED TISSUE 

All kits used in this study were used following the instructions of the manufacturer.
Total RNA was isolated using the Bio&Sell RNA Mini Kit (Bio&Sell e.K., Feucht, Germany).
To evaluate the abundance of SPO11 transcripts in generative tissue, fresh young flowers
were used for RNA isolation. In the case of C. papaya, flowers were stored in RNAshield
(Zymo Research Europe GmbH, Freiburg, Germany) prior to RNA isolation. To check the
abundance in vegetative tissue, leaf material was used. In the case of C. papaya no leaf
material was available, so fruit exocarp tissue was used instead. To check expression in
P. patens, 6-week-old gametophores were used for RNA isolation. Isolated RNA was treated
with DNase I (Thermo Fisher Scientific, Germany). To check for contamination with genomic
DNA in the treated RNA, a PCR was performed with RNA as the template. No contamination
was found in the RNA samples after DNase treatment (data not shown). cDNA was produced
using an anchored oligo dT primer with the Maxima H Minus Reverse Transcriptase Kit
(Thermo Fisher Scientific, Germany) and 2-4 µg of total RNA as the template for the
RT reaction.

MOLECULAR CHARACTERIZATION OF SPO11

Reverse transcribed cDNA was used as a template for a PCR reaction with 50 amplification
cycles. The resulting PCR products were purified using the GeneJET PCR purification Kit
(Thermo Fisher Scientific, Germany) and cloned into the insTA-cloning vector system
(Thermo Fisher Scientific, Germany). Resulting clones were screened by colony PCR using
M13 primers. Clones differing in the size of their insert were sequenced and analyzed
using MegAlign.

RESULTS 

IDENTIFICATION OF SPO11 HOMOLOGS IN THE PLANT KINGDOM

The progress in sequencing and the growing amount of data in the sequence databases
provided us with a powerful tool for the identification of putative homologous proteins
in a rapidly growing number of organisms by database searches using common bioinformatics
tools such as the BLAST programs (TBLASTN = protein sequence search against the
respective genomic sequence). By using known sequences of SPO11 from A. thaliana and
O. sativa we were able to identify orthologs of SPO11-1 and -2 in all publicly available
land plant genome assemblies sequenced to date. The identities of the orthologs to
SPO11-1 from A. thaliana range from 95.9% for Arabidopsis lyrata to 45.4% for P. patens.
The identities of the orthologs to SPO11-2 from A. thaliana are comparable to those found
for SPO11-1: for A. lyrata the identity is 96.9%, and the lowest identity is again found
for P. patens with 47.5% (Supplemental Tables 2, 3). In both cases, the monocotyledonous
plants show approximately 10% less identity than the dicotyledonous plants, reflecting
the earlier split of monocots and dicots (Supplemental Tables 2, 3).

Frontiers in Plant Science | Plant Genetics and Genomics | May 2014 | Volume 5 | Article 214
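The identity percentages quoted for the SPO11 orthologs are pairwise values from sequence alignments. As a minimal sketch of how such a value can be computed, the function below counts identical residues over alignment columns in which neither sequence has a gap. The choice of denominator is an assumption made for illustration; the text does not state which convention MegAlign applied, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumption: gap columns ('-') are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

Other conventions (for example dividing by the full alignment length or by the shorter sequence) give lower values for gapped alignments, so percentages from different tools are only comparable when the convention is known.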

In our database analyses we found orthologs of SPO11-1 and -2 in all land plants with
completely sequenced genomes. The conserved gene structure of SPO11-1 in land plants
contains 15 exons and 14 introns in the coding region. This structure has been verified
earlier by sequencing of the cDNAs from A. thaliana and O. sativa (Hartung and Puchta,
2000; Jain et al., 2006). In a large number of cases, the annotation of these orthologs
corresponded to the known cDNAs, but in several cases the correspondence was incomplete.
In virtually all of the latter cases we could perform a manual correction according to
the known sequences. In the asterid Utricularia gibba we found that intron one was
missing, clearly indicating an intron loss event in this species. Table 1 gives the
predicted position and phase of the introns in relation to the deduced protein sequence.
All plants with a completely sequenced genome possess SPO11-2 and show a conserved gene
structure concerning the position of the 10 introns in the coding region of SPO11-2
(Table 1). However, we can identify three exceptions. Firstly, Malus domestica, Prunus
persica, Vitis vinifera, Fragaria vesca, and Eucalyptus grandis all lack the first
intron, so it has most probably been lost in a common ancestor of these species.
Secondly, in some rice species a loss of intron two occurred, as this intron is missing
only in O. sativa and O. glaberrima. This intron loss must have occurred recently, as the
close relative O. brachyantha contains intron two. Thirdly, Aquilegia coerulea, belonging
to the Ranunculaceae, encodes a SPO11-2 gene which does not contain a single intron
(Supplemental Figure 1). Most probably this SPO11-2 gene is a reinserted copy of a fully
spliced, reverse transcribed mRNA, a mechanism which has also been proposed for the
origin of SPO11-3 (Hartung et al., 2002b).

Considering all this, it is very clear that SPO11-2 existed before the evolution of land
plants, which took place approximately 450 million years ago, as exemplified by the
SPO11-2 sequence (genomic and cDNA) of the moss P. patens, an extant member of one of the
oldest land plant lineages (Supplemental Figure 1). However, there is a recognizable gap
in conservation regarding a second or third SPO11 gene in green algae and in other algae
that belong to the Heterokontophyta or Rhodophyta. All fully sequenced green algae
contain a single SPO11 gene that shows the highest sequence identity to SPO11-3 from land
plants. In all of these algae, the second subunit TOPVIB is also present, as has been
shown earlier by Malik et al. (2007). This indicates that, like land plants, algae most
probably possess a functional complex of TOPVIA and B. A very interesting feature of the
SPO11-3 gene structure in green and other algae is that this gene possesses a high number
of introns (14 in Chlamydomonas reinhardtii) that do not correspond to the introns found
in plant SPO11-1 or -2, whereas SPO11-3 in land plants possesses only one intron (whose
position corresponds to intron no. 6 of CreSPO11-3) or none at all (Supplemental
Figure 2).



Table 1 | Intron localization of A. thaliana, H. sapiens, and the SPO11 genes from the two fungi C. cinerea (Basidiomycota) and C. grayi
(Ascomycota) with respect to their corresponding amino acid sequence positions.

Introns are listed per gene in 5' to 3' order; intron numbers 1 to 14 refer to Ath SPO11-1. IP values 0, 1, and 2 give the intron phase.

Ath SPO11-1
  Pos. (aa): 18.3 (a), 51.6, 76.3, 110.3, 135, 140.6, 164, 176.3, 192.3, 212, 222.6, 258, 298.3, 319.6; End 362
  IP:        1, 2, 1, 1, 0, 2, 0, 1, 1, 0, 2, 0, 1, 2

Hsa SPO11
  Pos. (aa): 43.6, 81.6, 111.3, 133.6, 170, 199, 211.3, 248, 281.3, 294, 319.6, 357; End 396
  IP:        2, 2, 1, 2, 0, 0, 1, 0, 1, 0, 2, 0

Cci SPO11
  Pos. (aa): 68.6, 99.3, 121.6, 140, 158 (*), 187, 215, 239, 261.6, 312.6, 331, 344.6, 376.6; End 401
  IP:        2, 1, 2, 0, 0, 0, 0, 0, 2, 2, 0, 2, 2

Cgr SPO11
  Pos. (aa): 74.3, 96.6, 133, 164, 190, 215; End 378
  IP:        1, 2, 0, 0, 0, 0

Ath SPO11-2
  Pos. (aa): 28 (b), 56 (c), 99.3, 145.3, 175, 218, 249.6, 270.6, 296.3, 339; End 383
  IP:        0, 0, 1, 1, 0, 0, 2, 2, 1, 0

The numbering of introns was done with respect to the highest number of 14 introns in Arabidopsis SPO11-1. Gaps are included in the other lines to better visualize
the conserved intron positions.

a This intron has been lost in Utricularia gibba.
b This intron has been lost in Fragaria vesca, Malus domestica, Mimulus guttatus, Prunus persica, and Vitis vinifera.
c This intron has been lost in Oryza brachyantha and Oryza sativa.
* This intron number 5 of C. cinerea is in the same conserved position as intron number 5 of Arabidopsis SPO11-1 and H. sapiens but is preceded by a non-conserved
intron position (no. 4).

Color coding: Orange, intron position conserved at least since the split of the plant and animal kingdom, sometimes (8 and 12) lost later on in fungi; Yellow, intron
position conserved between H. sapiens (as representative for animals) and two fungal divisions. Abbreviations: IP, Intron position; Ath, Arabidopsis thaliana; Cci,
Coprinopsis cinerea; Cgr, Cladonia grayi; Hsa, Homo sapiens.

Malik et al. (2007) performed extensive phylogenetic analyses in which they described a
second SPO11 gene that can be found in Chlorophyta (Prasinophyceae), Rhodophyta, and
Heterokontophyta and is, by its sequence homology, most closely related to SPO11-2 of
plants. However, a meiotic function of the gene has not been demonstrated for any of
these organisms so far, and additionally, the gene structure is highly different from
that of SPO11-2 from land plants (Supplemental Figure 2). The SPO11-2-like genes of
phylogenetically very different algae possess either no intron at all or a much smaller
number of introns, in positions that do not correspond to the highly conserved positions
found in all land plant SPO11-2 orthologs (Supplemental Figure 2). Taking all data
together, two very early duplications of the original SPO11-3 (which is orthologous to
TOP6A from archaea) must have occurred, followed by a number of losses in different
kingdoms.

This raises the question of whether SPO11-2 from algae is really orthologous to SPO11-2
from land plants. To address this question, we can use the method of comparing intron
positions which we developed earlier (Hartung et al., 2002b). In brief, after the
alignment of the protein sequences, each intron position is projected onto these
sequences. An intron can be located between two coding triplets (phase 0) or can
interrupt a coding triplet after the first or second nucleotide (phases 1 and 2, written
as, e.g., amino acid 18.3 or 18.6, respectively). Doing so for all genes, we can clearly
see that six intron positions in SPO11-1 are conserved throughout the animal and plant
kingdoms, spanning a time frame of almost one billion years (Table 1; Hartung et al.,
2002b). These introns are numbers 3, 5, 7, 8, 10, and 12 with respect to the AthSPO11-1
gene (Table 1). The ancient intron positions 8 and 12 have most probably been lost only
in the fungal kingdom, after the divergence of plants and animals/fungi. Furthermore,
even one intron of SPO11-2 (no. 6) is somewhat conserved with respect to fungal SPO11,
which is a single copy gene (Hartung et al., 2002b). These conserved intron positions
cannot be found in the second SPO11 copy in algae or protists (Supplemental Figure 2).
Considering this, we think that the second SPO11 in protists and algae is an ortholog of
plant SPO11-2 due to its sequence conservation, but that many changes in its gene
structure have taken place during evolution (Malik et al., 2007; this work).
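The position notation used for the intron comparison encodes the phase in the decimal part: no decimal means phase 0, ".3" means phase 1, and ".6" means phase 2. A minimal sketch that decodes this notation is shown below; the function name is hypothetical, and the position list is the Ath SPO11-1 row of Table 1.

```python
def intron_phase(label: str) -> int:
    """Decode the intron phase from a position label such as '18.3' or '135'."""
    whole, _, fraction = label.partition(".")
    if not whole.isdigit():
        raise ValueError(f"not a valid position label: {label!r}")
    phases = {"": 0, "3": 1, "6": 2}  # decimal part -> intron phase
    if fraction not in phases:
        raise ValueError(f"unexpected decimal part in {label!r}")
    return phases[fraction]


# Intron positions of A. thaliana SPO11-1 as listed in Table 1.
ATH_SPO11_1 = ["18.3", "51.6", "76.3", "110.3", "135", "140.6", "164",
               "176.3", "192.3", "212", "222.6", "258", "298.3", "319.6"]

ath_phases = [intron_phase(label) for label in ATH_SPO11_1]
```

Two introns from different genes are only candidates for a conserved position if they fall on the same column of the protein alignment and share the same phase, which is why the phase is carried along with the amino acid coordinate.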

ANALYSIS OF SPO11 cDNAs

Based on the obtained database sequences, we designed primer pairs to amplify the whole
coding sequence (CDS) of SPO11-1 and SPO11-2 from B. rapa, C. papaya, O. sativa, and
P. patens. The predicted models fit the amplified CDSs in all cases. Using preamplified
cDNA of the corresponding species, both SPO11 genes could be amplified in their full
length from C. papaya, B. rapa, and A. thaliana. From P. patens and O. sativa only
SPO11-1 could be amplified as a full length construct. For SPO11-2 from P. patens, two
overlapping fragments were amplified, sequenced, and joined in silico afterwards. For
O. sativa no full length construct of SPO11-2 could be amplified due to the high GC
content in the 5' region (GC > 80%). Every time we tried to amplify SPO11-2, all
constructs were artificially modified due to a repetitive sequence in the 5' region. In
this region the PCR leaped directly from one repetitive sequence to the next, resulting
in constructs without a methionine that could not possibly have been spliced in a natural
way. Due to this artifact, SPO11-2 from O. sativa was not analyzed further in detail. The
structures of the SPO11-1 and -2 genes are shown schematically in Figure 1. In all cases,
SPO11-1 consists of 15 exons and 14 introns. SPO11-2 consists of 11 exons interrupted by
10 introns in all cases, except for O. sativa and O. brachyantha, in which intron 2 has
been lost. The CDS and protein length of each analyzed SPO11 is shown in Table 2.
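The CDS and protein lengths in Table 2 are linked by simple arithmetic: each value pair is consistent with a CDS length that includes the stop codon, so the protein length is the number of codons minus one. The sketch below checks this for all ten entries. That the reported CDS lengths include the stop codon is an inference from the numbers, not a statement in the text, and the function name is hypothetical.

```python
def protein_length(cds_bp: int) -> int:
    """Protein length in aa for a CDS whose length includes the stop codon."""
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    return cds_bp // 3 - 1  # the stop codon does not encode an amino acid


# (organism, gene) -> (CDS length in bp, protein length in aa), from Table 2.
TABLE_2 = {
    ("A. thaliana", "SPO11-1"): (1089, 362),
    ("A. thaliana", "SPO11-2"): (1152, 383),
    ("B. rapa", "SPO11-1"): (1089, 362),
    ("B. rapa", "SPO11-2"): (1143, 380),
    ("C. papaya", "SPO11-1"): (1086, 361),
    ("C. papaya", "SPO11-2"): (1149, 382),
    ("O. sativa", "SPO11-1"): (1146, 381),
    ("O. sativa", "SPO11-2"): (1158, 385),
    ("P. patens", "SPO11-1"): (1086, 361),
    ("P. patens", "SPO11-2"): (1113, 370),
}

consistent = all(protein_length(bp) == aa for bp, aa in TABLE_2.values())
```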

Full length cDNAs were assembled from the RT-PCR data and compared to the genomic
sequences in the databases. Astonishingly, in our attempts to amplify the cDNA of each
gene by RT-PCR, we rarely found a single clearly distinguishable band. In most cases,
more than one band accompanied by a smear was visible in the ethidium bromide stained gel
(Figure 2). After cloning and sequencing of the PCR products we were able to identify
different alternatively spliced variants of both SPO11 cDNAs.

PATTERN OF ALTERNATIVELY SPLICED SPO11

In the course of analyzing the patterns of alternative splicing of SPO11, different
splicing events which lead to putatively non-functional proteins could be detected
(Figure 3). In most cases we found intron retention (IR), mostly leading to a premature
termination codon (PTC) and an altered length of the putative proteins. In some cases
exon skipping (ES) occurred, and we also observed events with altered 5' or 3' splice
sites (alt 5'ss or alt 3'ss) leading to shorter or longer exons, which in most cases
introduced PTCs.
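To make the link between intron retention and a premature termination codon concrete, the following minimal sketch scans a cDNA in the reading frame of its start codon and reports the first in-frame stop. The sequences are invented toy examples, not SPO11 sequences, and the function names are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cdna: str) -> Optional[int]:
    """Return the 1-based codon index of the first in-frame stop, or None.

    The reading frame is assumed to start at the first nucleotide (the ATG).
    """
    seq = cdna.upper().replace("U", "T")
    for start in range(0, len(seq) - 2, 3):
        if seq[start:start + 3] in STOP_CODONS:
            return start // 3 + 1
    return None


def has_ptc(cdna: str, reference_stop_codon: int) -> bool:
    """True if a stop codon occurs before the codon index of the reference stop."""
    stop = first_stop_codon(cdna)
    return stop is not None and stop < reference_stop_codon


# Toy example: two exons and one 8-nt intron (length not divisible by 3).
exon1, intron, exon2 = "ATGGCT", "GTATGAAG", "GAACGTTAA"
spliced = exon1 + exon2             # ATG GCT GAA CGT TAA
retained = exon1 + intron + exon2   # ATG GCT GTA TGA ...
```

In the toy transcript the retained intron itself contributes the stop codon; a retained intron whose length is not a multiple of three can also shift the frame so that a stop appears in the downstream exon.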

When comparing the patterns of alternative splicing events of SPO11-1 in vegetative and
generative tissue, we could detect only very few events with a matching pattern in both
tissue types (Supplemental Table 4). Furthermore, these patterns also differ between the
analyzed plants. We found no alternative splicing event that was conserved between two
different plants in our analyses, indicating that the events are species and tissue
specific.

Analyzing A. thaliana SPO11-1 (Figure 3A), a total of eight alternative splicing events
could be found (β-ι). Of these, five events were IR (β-ζ), one alt 5'ss (θ), one alt
3'ss (η), and one alt 3'ss combined with IR (ι). All alternative splicing events resulted
in altered, putatively truncated proteins varying from 69 amino acids (aa) to 324 aa in
length instead of 362 aa (Supplemental Table 4). For A. thaliana SPO11-2 (Figure 3a), six
alternative splicing events could be observed (β-η): three IR events (β-δ), one alt 5'ss
(ε), one alt 5'ss combined with IR (ζ), and one alt 3'ss combined with ES (η). Five forms
result in a PTC and putatively truncated proteins ranging from 52 to 305 aa instead of
383 aa. One form (η), missing exon 3 and parts of exon 4, does not contain a PTC and
leads to a putatively functional protein of 303 aa (Supplemental Table 4).

The analysis of SPO11-1 alternative splicing events in B. rapa revealed five different
forms (β-ζ), which consist of two IR (β, γ), two alt 3'ss (ε, ζ), and one combination of
ES with IR (δ) (Figure 3B). One alternative splicing event (ε) does not contain a PTC;
here the protein is shortened by 9 aa. All other events lead to a PTC, and the putative
protein sequences are therefore truncated, ranging from 82 to 153 aa instead of 362 aa
(Supplemental Table 4). In the case of B. rapa SPO11-2, five alternative splicing events
were detected (β-ζ). All of them had one or more IR (Figure 3b), four of them with a PTC
putatively leading to truncated proteins between 32 and 268 aa in length. One IR event,
the retention of intron 10 (δ), did not lead to a PTC, resulting in an altered putative
protein with 404 aa instead of 380 aa (Supplemental Table 4).

FIGURE 1 | The to-scale exon-intron organization of SPO11-1 (A) and SPO11-2 (B) for five
analyzed species. Ath, Arabidopsis thaliana; Bra, Brassica rapa; Cpa, Carica papaya; Ppa,
Physcomitrella patens; Osa, Oryza sativa. Coding regions are represented by gray boxes.
The introns are represented by black lines. * Intron 2 has been lost in OsaSPO11-2. For
better comparison, the exon resulting from the fusion of exons 2 and 3 is marked 2/3.

FIGURE 2 | Semiquantitative RT-PCR of SPO11-1 and -2 from Arabidopsis thaliana (A) and
Brassica rapa (B). 1 µl of each cDNA was used for the PCR reaction. In the case of
SPO11-1, distinct bands are visible. The lower band represents the α form of SPO11-1; the
others are a mixture of other forms. The same holds true for SPO11-2.

Table 2 | Length of the coding sequence and the respective deduced protein length of
SPO11-1 and -2 from different species.

Organism                Gene      CDS length (bp)   Protein length (aa)
Arabidopsis thaliana    SPO11-1   1089              362
                        SPO11-2   1152              383
Brassica rapa           SPO11-1   1089              362
                        SPO11-2   1143              380
Carica papaya           SPO11-1   1086              361
                        SPO11-2   1149              382
Oryza sativa            SPO11-1   1146              381
                        SPO11-2   1158              385
Physcomitrella patens   SPO11-1   1086              361
                        SPO11-2   1113              370

Abbreviations: bp, base pair; aa, amino acid.

FIGURE 3 | Unscaled scheme of the different splice forms of SPO11-1 (A-E) and -2 (a-e)
from Arabidopsis thaliana (A,a), Brassica rapa (B,b), Carica papaya (C,c), Oryza sativa
(D,d), and Physcomitrella patens (E,e). Exons are numbered and shown as white blocks,
spliced introns as black lines. Intron retention events are illustrated as black boxes,
alternative 5' splice site selection is shown as blue boxes, and alternative 3' splice
site selection as light green boxes. In the case of exon skipping the corresponding white
box is missing. Splicing forms are named with Greek letters. Splice forms found in
generative tissue are marked with a red bar; splice forms found in vegetative tissue are
marked with a green bar. Splice forms found in both tissues have both bars. Putatively
functional forms are marked with an asterisk. Due to high GC content and resulting PCR
failure, amplification of OsaSPO11-2 was only possible from exon 2 onward, so exon 1 is
not indicated.

The evaluation of the alternative splicing events in SPO11-1 from C. papaya revealed the
highest number, 11 alternative splicing events (β-μ), all differing in type (Figure 3C).
We found IR, ES, alt 5'ss, and alt 3'ss as well as all kinds of combinations of those
types. All constructs contained a PTC leading to putatively truncated proteins ranging
from 30 to 210 aa in size instead of 361 aa (Supplemental Table 4). For CpaSPO11-2, five
different alternative splicing events were detected (β-ζ). All had IR, and one
combination of IR with an alt 3'ss was also detected (δ) (Figure 3c). Four events lead to
a PTC and putative proteins between 97 and 270 aa instead of 382 aa. One event (γ), in
which intron 9 is retained, could lead to an altered protein of 410 aa in length
(Supplemental Table 4).
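For the two retention events that do not introduce a PTC, the reported protein lengths allow a rough estimate of how many nucleotides the retained sequence adds to the reading frame. The sketch below does this arithmetic under explicit assumptions: the whole intron is retained, it is read through in frame, and the original stop codon is still used. The intron lengths themselves are not given in the text, so the results are inferences rather than reported values, and the function name is hypothetical.

```python
def in_frame_insertion_length(reference_aa: int, variant_aa: int) -> int:
    """Nucleotides added to the ORF when the variant protein is longer."""
    extra_residues = variant_aa - reference_aa
    if extra_residues <= 0:
        raise ValueError("variant protein must be longer than the reference")
    return 3 * extra_residues  # one codon per additional amino acid


# B. rapa SPO11-2, retention of intron 10: 404 aa instead of 380 aa.
bra_intron10_nt = in_frame_insertion_length(380, 404)
# C. papaya SPO11-2, retention of intron 9: 410 aa instead of 382 aa.
cpa_intron9_nt = in_frame_insertion_length(382, 410)
```

Under these assumptions the inserted sequences are 72 and 84 nucleotides long, both multiples of three, which is the precondition for a retained intron to lengthen the protein without shifting the frame.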

In O. sativa we were only able to analyze the alternative splicing events of SPO11-1,
because SPO11-2 has a very high GC content in the 5' region of its genomic coding
sequence. This high GC content prevented successful amplification of the cDNA up to
exon 2. In the case of SPO11-1 we identified six alternative splicing events (β-η). We
found IR as well as a combination of alt 5'ss and alt 3'ss with and without IR
(Figure 3D). Five of these constructs lead to a PTC, resulting in altered putative
protein lengths between 109 and 237 aa instead of 381 aa. One construct (γ) with
shortened exons 1 and 2 did not lead to a PTC and results in a truncated putative protein
of 350 aa (Supplemental Table 4). Despite the problems with PCR amplification, we
identified one alternative splicing event for SPO11-2, in which intron 7 is retained
(Figure 3d).
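The GC content that hampered amplification of OsaSPO11-2 can be screened for before primer design. The following minimal sketch computes the GC fraction of a sequence and lists windows that exceed a threshold. The 0.8 default mirrors the "GC > 80%" figure given in the text, whereas the window and step sizes are arbitrary illustrative choices and the function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc_rich_windows(seq: str, window: int = 100, step: int = 10,
                    threshold: float = 0.8) -> list:
    """Return (start, gc_fraction) for every window above the threshold."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_fraction(seq[start:start + window])
        if gc > threshold:
            hits.append((start, gc))
    return hits
```

For example, `gc_rich_windows(cds[:300])` would flag GC-rich stretches in the first 300 nucleotides of a coding sequence held in a hypothetical variable `cds`.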

In P. patens, we could find only one alternative splicing event for each SPO11
(Figures 3E,e). In SPO11-1, intron 8 was retained, resulting in a PTC and a putatively
shortened protein of 181 aa instead of 361 aa (Supplemental Table 4). In SPO11-2, exon 2
was skipped without causing a PTC, generating a putatively truncated protein with a
length of 342 aa instead of 372 aa (Supplemental Table 4).

The majority of alternative transcripts found in these experiments lead to putatively
non-functional proteins. Only a small number of alternative transcripts may lead to
functional protein forms; these transcripts were found exclusively in generative tissue
and were outnumbered by the alternative transcripts which contained a PTC.

FIGURE 4 | Proposed evolution scheme of SPO11 by two duplications and different loss
events. The proposed evolution of the three different SPO11 genes found in land plants
today is shown schematically. Whereas bacteria do not possess a topoisomerase 6, LAECA
developed a topoisomerase type 6 whose subunit TOP6A is orthologous to SPO11-3 in
eukaryotes. Two duplication events of SPO11-3 took place after the separation of eukarya
from archaea, resulting in the additional SPO11-1 and SPO11-2 genes. In different phyla,
loss events of either SPO11-1 or -2 occurred. After separation of the animal and fungal
kingdoms, SPO11-2 and -3 as well as TOP6B must have been lost, resulting in the single
remaining SPO11 gene present in these two kingdoms. Abbreviations: LUCA, last universal
common ancestor; LAECA, last archaeal-eukaryal common ancestor. The term LAECA was taken
from Forterre (2013).



DISCUSSION 

EVOLUTION OF DIFFERENT SPO11 GENES

The time frame of SPO11 gene evolution remains unclear, as a second SPO11 copy must have
arisen very early, most probably by gene duplication and subsequent divergence of the two
genes. The most likely scenario is that SPO11-3, which shows by far the best sequence
homology to TOPVIA from archaea and additionally is still functional and interacting with
TOPVIB in plants, was the ancestor from which gene duplications gave rise to the other
SPO11 copies (Hartung et al., 2002a; Malik et al., 2007). The sequence homology of
SPO11-2 to the second SPO11 found in protists, shown by Malik et al. (2007), favors this
gene as the first result of duplication and speciation. However, as we could show earlier
and confirm here, SPO11-1 from plants is clearly orthologous to SPO11 from fungi and
animals, indicating a very early appearance of this gene by duplication of SPO11-3
(Hartung et al., 2002a; Forterre et al., 2007; this work). Therefore, in our opinion a
duplication of the ancestral SPO11-3 must have occurred twice and very early, giving rise
to SPO11-1 and -2, which we currently find either in animals and fungi (SPO11-1) or in
algae and protists (SPO11-2). The organisms that currently contain only SPO11-1 must have
lost the other copies, whereas protists that contain SPO11-2 and -3 orthologs have lost
only SPO11-1 (Figure 4). Finally, in land plants all known copies of SPO11 are still
encoded and active, as we and others have shown earlier for all three SPO11 genes (Grelon
et al., 2001; Hartung et al., 2002a,b, 2007; Sugimoto-Shirasu et al., 2002; Stacey et al.,
2006). In addition, SPO11-3 is present together with the second subunit TOPVIB not only
in plants but also in all green algae and protists investigated so far, which is not the
case in animals and fungi (Malik et al., 2007) (Figure 4). This points to a conserved and
linked function of both gene products, as we and others have shown for Arabidopsis
(Hartung et al., 2002b; Sugimoto-Shirasu et al., 2002).

Nevertheless, the exact evolution and function of two SPO11
in plant meiosis is still enigmatic. We show that both meiotically
active SPO11 genes undergo an extremely complicated
splicing procedure leading to high numbers of mostly aberrant
alternative splice products. Despite the very high conservation
of the gene structure for SPO11-1 and -2, whose introns are
in virtually 100% identical positions throughout all land plants,
the alternative splicing seems to be regulated specifically in each
species. It is not clear whether all different splicing forms of
SPO11 found in this study are real alternatively spliced transcripts
or if some may result from sampling unprocessed pre-mRNAs
or genomic DNA contamination. However, there are some clues
that the identified alternative splicing patterns are real events:
(1) the pattern is found for both SPO11 at a similar rate and is the
same as described by Hartung and Puchta (2000), (2) the pattern
is conserved between different species, (3) amplification of
genomic DNA was not possible (Supplemental Figure 3A), and
(4) of the analyzed meiotic genes, only SPO11-1 and -2, PHS1
and VIP3 show this pattern (Supplemental Figure 3B). An
alternative splicing pattern was described for VIP3 and SPO11-1 earlier
(Hartung and Puchta, 2000; Terzi and Simpson, 2009). The findings
of this study for SPO11-2 differ slightly from those of
Hartung and Puchta (2000) because we
now took a closer look especially at SPO11-2 and used a different
protocol for RT-PCR combined with a higher number of PCR
cycles. The conservation of alternative splicing between orthologous
genes has been described in A. thaliana and O. sativa (Wang
and Brendel, 2006). For this reason, it is not extraordinary that
the alternative splicing is conserved not only between A. thaliana
and O. sativa but also between the other analyzed species. Wang
and Brendel (2006) also reported that the type of alternative
splicing is more conserved than the respective intron which is spliced,
as also seen for SPO11-1 and -2 in this study.

Looking at another eukaryotic kingdom, previous
studies also showed a pattern of alternatively spliced
transcripts of SPO11 in mouse and human (Shannon et al., 1999). In this
previous work various alternatively spliced transcripts were
identified. Most of them were not further analyzed, but two transcript
variants of the expected size code for functional proteins. These
two forms, SPO11-α and SPO11-β, differ only in the presence of
exon 2. SPO11-α is missing exon 2, resulting in a shortened
protein. The same forms were found in humans (Romanienko and
Camerini-Otero, 1999). We were not able to find splicing forms
equivalent to SPO11-α/β from mammals because
the protein sequence in this area has little homology
to SPO11 from plants. However, we were able to find other putatively
functional forms in plants, as shown in Figure 3. The fact that
alternative splicing of SPO11 is also common in other kingdoms
leads us to suggest that this mechanism is highly conserved and might
have a regulatory function.

SPO11 AND THE NMD PATHWAY

Several features are known to trigger non-sense mediated decay in
plants. It was shown that long 3' untranslated regions (UTRs)
as well as an intron in the 3'UTR can trigger the NMD pathway
(Kertesz et al., 2006). We could previously show that A. thaliana
SPO11-1 and -2 both harbor an intron in the 3'UTR and show
different poly(A) sites, which sometimes results in long 3'UTRs
(Hartung and Puchta, 2000). In this study we determined various
poly(A) sites of SPO11 in O. sativa and C. papaya (data not
shown) that affect the position of the poly(A) tail and sometimes
lead to long 3'UTRs. Another feature that may lead to non-sense
mediated decay, besides a long 3'UTR, is the presence of upstream
open reading frames (uORFs) adjacent to the start codon of the gene
(Nyiko et al., 2009). Analyzing the 5'UTR of A. thaliana SPO11-1 and -2,
we could identify long uORFs in both cases. For other species such
as C. papaya and O. sativa, such long and adjacent uORFs could
not be found for both SPO11. However, for all analyzed species
we were able to identify alternative splicing events that lead to
PTCs, which are presumed to be targeted by the non-sense mediated
decay pathway (for a recent review see Reddy, 2007). In plants,
many pathways such as the circadian clock and the flowering time
are controlled via alternative splicing of core genes (James et al.,
2012; Staiger and Brown, 2013). Alternative splicing and variable
polyadenylation have been reported for VIP3 during flower
development of Arabidopsis (Terzi and Simpson, 2009). VIP3 is the
Arabidopsis ortholog of SKI8 from yeast, one of the described
direct interaction partners of SPO11 in Saccharomyces cerevisiae
(Arora et al., 2004). There must be a reason for the conserved
alternative splicing of SPO11-1 and -2 in plants. One possibility
could be that SPO11 is controlled in a precise way via the
pathways of alternative splicing and non-sense mediated decay.
The NMD pathway offers a mechanism which is routinely used by
mammals and others to regulate gene expression (Lareau et al.,
2004; Lejeune and Maquat, 2005). Such effects were observed for
mouse and human, where the splicing of SPO11 and other
meiosis-specific genes is regulated during meiosis (Habu et al., 1996; Schmid
et al., 2013). It has long been known for yeast that genes which are
involved in meiosis show alternative splicing (Engebrecht et al.,
1991; Guisbert et al., 2012). Considering that the number of
possible NMD candidates in plants is quite similar to the frequency
observed for humans, it seems likely that plants may also use
non-sense mediated decay and alternative splicing for gene regulation
in a comparable way (Lareau et al., 2004; Wang and Brendel,
2006).
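
The two sequence features discussed above, a uORF in the 5'UTR and a
PTC introduced by intron retention, can be illustrated with a minimal
Python sketch. This is not the analysis pipeline used in this study.
The function names, the min_codons threshold, and the toy sequences are
hypothetical and chosen only for illustration; they are not SPO11
sequences, and whether a given transcript is actually degraded by NMD
depends on further features that are not modeled here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def normalize(seq):
    """Upper-case a nucleotide string and convert RNA (U) to DNA (T)."""
    return seq.upper().replace("U", "T")


def find_uorfs(five_prime_utr, min_codons=2):
    """Return (start, end) index pairs of ORFs lying entirely in the 5'UTR.

    An ORF starts at an ATG and ends at the first in-frame stop codon.
    min_codons counts the codons before the stop, including the ATG
    (hypothetical threshold, for illustration only).
    """
    seq = normalize(five_prime_utr)
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    uorfs.append((start, pos + 3))
                break
    return uorfs


def first_inframe_stop(cds):
    """Return the index of the first in-frame stop codon, or None."""
    seq = normalize(cds)
    for pos in range(0, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


def has_ptc(variant_cds, reference_cds):
    """True if the variant reaches a stop codon earlier than the reference."""
    variant_stop = first_inframe_stop(variant_cds)
    reference_stop = first_inframe_stop(reference_cds)
    if variant_stop is None or reference_stop is None:
        return False
    return variant_stop < reference_stop


# Toy example (made-up sequences): two exons and one retained intron.
exon1, intron, exon2 = "ATGGCTGAA", "GTATGATTTCAG", "GGTCTTTAA"
spliced = exon1 + exon2            # fully spliced transcript
retained = exon1 + intron + exon2  # intron retention (IR) variant

print(find_uorfs("CCATGAAATAGCC"))
print(first_inframe_stop(spliced), first_inframe_stop(retained))
print(has_ptc(retained, spliced))
```

In this toy case the retained intron introduces an in-frame stop codon
upstream of the regular one, which is the situation described for the
IR variants containing a PTC.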

While further analyses on the localization of the alternatively
spliced isoforms need to be done, this study revealed differences in
the alternatively spliced forms of SPO11-1 and -2 between
generative and vegetative tissue. Such tissue-specific regulation of NMD
has been shown before and has recently been studied especially in
mammals (Zetoune et al., 2008; Huang and Wilkinson, 2012). An
accurate differentiation between single cell types could give closer
insight into alternative splicing during pre-meiotic and
meiotic stages, as done for yeast and mammals (Engebrecht et al.,
1991; Schmid et al., 2013). The very weak expression, especially
of SPO11-2, could make this a challenging task. Up to now
little is known about the function of the conserved domains in
SPO11 (Bergerat et al., 1997). A closer look and more
information on those domains could contribute to the
understanding of the putative function of the alternatively spliced isoforms.
Investigating nmd mutants could provide us with more
information about the potential regulation of SPO11-1 and -2 via
NMD in Arabidopsis. In previously published studies, SPO11
mRNA was not captured, mostly due to its weak expression and
inadequate conditions for the amplification of SPO11 (Simpson
et al., 2008; Kalyna et al., 2012). Taking a closer look at SPO11
expression in these plants would be of great advantage.